Kernel: Python (ds_env)

The properties that determine a hazardous asteroid

Our project is about finding the properties that determine whether an asteroid is hazardous. Using data collected through NASA's Open API and published by Sameep Vani, we created multiple graphs and charts to test our hypothesis and reach a credible conclusion.

Our hypothesis is that if the relative speed is large and the size of an asteroid is small, the asteroid may be considered harmful.

In [1]:
import pandas as pd
import numpy as np
import plotly.express as px
In [2]:
earth_object = pd.read_csv('earth_object.csv')
earth_object.head()
Out[2]:
        id                 name  est_diameter_min  est_diameter_max  relative_velocity  miss_distance  orbiting_body  sentry_object  absolute_magnitude  hazardous
0  2162635  162635 (2000 SS164)          1.198271          2.679415       13569.249224   5.483974e+07          Earth          False               16.73      False
1  2277475    277475 (2005 WK4)          0.265800          0.594347       73588.726663   6.143813e+07          Earth          False               20.00       True
2  2512244   512244 (2015 YE18)          0.722030          1.614507      114258.692129   4.979872e+07          Earth          False               17.83      False
3  3596030          (2012 BV13)          0.096506          0.215794       24764.303138   2.543497e+07          Earth          False               22.20      False
4  3667127          (2014 GE35)          0.255009          0.570217       42737.733765   4.627557e+07          Earth          False               20.09       True

Timeline: which years have the most hazardous asteroids, and what are their size and speed

First, we added another column, year, taken from the designation in each asteroid's name. We used it to identify which year has the most asteroids that are potentially harmful to Earth. We also dropped the id and name columns because they were no longer needed. Because 2021 has the highest chance of asteroids colliding with Earth, we separated the 2021 rows from the rest and examined them in more detail with a pie chart.

The first histogram shows the relative velocity of the asteroids, summed by year, for the years 1898 to 2021. Even though 2021 had the highest chance, it was still relatively low, and the highest velocity is about 426 km/s. The histogram at the very bottom of this section displays the size, relative velocity, and hazard status of the 2021 asteroids. The biggest asteroid is about 30 km.

In [14]:
# deal with year variable
earth_object['year'] = 0

def years(row):
    temp = row['name']
    a = temp.split('(')
    year = a[1].split(' ')[0]
    row['year'] = year
    return row

earth_object = earth_object.apply(years, axis = 1)
earth_object.loc[earth_object.year=='A911','year']='1911'
earth_object.loc[earth_object.year=='6743','year']='1960'
earth_object.loc[earth_object.year=='A898','year']='1898'
earth_object.loc[earth_object.year=='6344','year']='1960'
earth_object.loc[earth_object.year=='A924','year']='1924'
earth_object.loc[earth_object.year=='A/2019','year']='2019'
earth_object.loc[earth_object.year=='4788','year']='1960'

# drop id and name
columns_to_drop = ['id', 'name']
earth_object.drop(columns_to_drop, axis=1, inplace=True)

# sort the data by year
earth_object = earth_object.sort_values(by=['year'])
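The same year extraction can also be written without a row-wise apply. The following is a minimal sketch, not the code we ran: extract_year and YEAR_FIXES are names introduced here for illustration, the mapping is copied from the cell above, and the third sample name is a made-up placeholder.

```python
import pandas as pd

# Designation prefixes that are not plain years, copied from the cell above.
YEAR_FIXES = {
    "A911": "1911", "A898": "1898", "A924": "1924", "A/2019": "2019",
    "6743": "1960", "6344": "1960", "4788": "1960",
}

def extract_year(names: pd.Series) -> pd.Series:
    """Return the first token inside the parentheses of each name."""
    # Capture everything after '(' up to the first space or ')'.
    token = names.str.extract(r"\(([^\s)]+)", expand=False)
    # Swap the known odd prefixes for the years used in the cell above.
    return token.replace(YEAR_FIXES)

# The first two names appear in Out[2]; the last one is a hypothetical example.
sample = pd.Series(["162635 (2000 SS164)", "(2012 BV13)", "(A911 XX)"])
years = extract_year(sample)
print(years.tolist())
```

One difference from the apply version: a name with no parenthesis gives a missing value here instead of raising an IndexError, so such rows would need to be checked separately.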
In [29]:
fig = px.histogram(earth_object, x="year", y="relative_velocity", title="asteroid harmfulness")
fig.show()
Out[29]:
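When y is given, px.histogram adds up the y values in each bin by default, so each bar is the total relative velocity of all asteroids in that year and grows with the number of asteroids recorded. A rough sketch of how the per-year numbers could be tabulated separately, assuming a frame with the same column names as earth_object (the demo rows below are invented, not taken from the dataset):

```python
import pandas as pd

# Invented rows that only mimic the column names of earth_object.
demo = pd.DataFrame({
    "year": ["2020", "2021", "2021", "2021"],
    "relative_velocity": [10000.0, 20000.0, 30000.0, 40000.0],
    "hazardous": [False, True, False, False],
})

per_year = demo.groupby("year").agg(
    n_asteroids=("relative_velocity", "count"),
    total_velocity=("relative_velocity", "sum"),   # what the bar height shows
    mean_velocity=("relative_velocity", "mean"),
    max_velocity=("relative_velocity", "max"),
    hazardous_share=("hazardous", "mean"),         # fraction flagged hazardous
)
print(per_year)
```

Looking at the count, the mean, and the hazardous share side by side helps separate "more asteroids were recorded" from "asteroids were faster" or "more of them were hazardous".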
In [20]:
# further explore the asteroids in 2021
most_asteroids = earth_object[earth_object['year'] == '2021']
fig = px.pie(most_asteroids, values="relative_velocity", names='hazardous', title='2021 hazardous level')
fig.show()
Out[20]:
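px.pie adds up the values column within each names category, so the chart shows each group's share of the summed relative velocity rather than its share of the asteroid count. A small sketch that computes both shares, on invented rows that only reuse our column names:

```python
import pandas as pd

# Invented 2021-style rows; only the column names match most_asteroids.
demo_2021 = pd.DataFrame({
    "relative_velocity": [20000.0, 30000.0, 50000.0, 100000.0],
    "hazardous": [False, False, False, True],
})

# Share of asteroids in each group (what "chance of being hazardous" suggests).
share_by_count = demo_2021["hazardous"].value_counts(normalize=True)

# Share of summed velocity in each group (what the pie chart slices show).
velocity_sum = demo_2021.groupby("hazardous")["relative_velocity"].sum()
share_by_velocity = velocity_sum / velocity_sum.sum()

print(share_by_count)
print(share_by_velocity)
```

In this toy frame the two shares differ for the hazardous group, which is why the count-based share is the safer number to quote as a chance.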
In [33]:
fig = px.histogram(most_asteroids, x='relative_velocity', y='est_diameter_max', color='hazardous', title='2021 asteroid size True')
fig.show()
Out[33]:

Relation between Average Diameter and Relative Velocity to Asteroid Hazard

To compare the velocity of hazardous and non-hazardous asteroids, we used a box plot, which puts the two groups side by side and shows their summary measurements. The hazardous asteroids have a higher median, minimum, and maximum velocity than the non-hazardous asteroids.

In [34]:
fig = px.box(earth_object, x="hazardous", y="relative_velocity", color="hazardous",
             title="Relative velocity for Hazardous vs Non Hazardous Asteroids")
fig.show()
Out[34]:
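The box plot can be backed by the numbers it draws. A minimal sketch, using only the five rows shown in Out[2] (far too few to be representative) and a helper name, box_summary, that is introduced here for illustration:

```python
import pandas as pd

# relative_velocity and hazardous for the five rows printed in Out[2].
head_rows = pd.DataFrame({
    "relative_velocity": [13569.249224, 73588.726663, 114258.692129,
                          24764.303138, 42737.733765],
    "hazardous": [False, True, False, False, True],
})

def box_summary(df, value, by="hazardous"):
    """Min, quartiles, and max of `value` for each group in `by`."""
    stats = df.groupby(by)[value].describe()
    return stats[["min", "25%", "50%", "75%", "max"]]

summary = box_summary(head_rows, "relative_velocity")
print(summary)
```

The same helper could be pointed at absolute_magnitude or miss_distance. Plotly may compute quartiles and whiskers slightly differently, so the table should be read as an approximation of the plot.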

In the next graph, we used a scatter plot to see whether the relationship between average diameter and relative velocity differs between hazardous and non-hazardous asteroids. The scatter plot showed no significant difference between the two kinds of asteroids. It also shows that average diameter does not play a role in determining the relative velocity of an asteroid.

In [25]:
earth_object['average_diameter'] = (earth_object['est_diameter_min'] + earth_object['est_diameter_max'])/2
fig = px.scatter(earth_object, x="average_diameter", y="relative_velocity", color="hazardous",
                 title="Average Diameter and Relative Velocity of Hazardous Asteroids vs Non - Hazardous Asteroids")
fig.show()
Out[25]:
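A correlation coefficient gives a number to go with the visual impression from the scatter plot. This is a sketch on the five rows from Out[2] only, so the values say nothing about the full dataset. The rank correlation is computed as a Pearson correlation of ranks, which is one standard way to obtain Spearman's coefficient.

```python
import pandas as pd

# Columns copied from the five rows printed in Out[2].
head_rows = pd.DataFrame({
    "est_diameter_min": [1.198271, 0.265800, 0.722030, 0.096506, 0.255009],
    "est_diameter_max": [2.679415, 0.594347, 1.614507, 0.215794, 0.570217],
    "relative_velocity": [13569.249224, 73588.726663, 114258.692129,
                          24764.303138, 42737.733765],
})
head_rows["average_diameter"] = (
    head_rows["est_diameter_min"] + head_rows["est_diameter_max"]
) / 2

# Linear (Pearson) correlation between diameter and velocity.
pearson_r = head_rows["average_diameter"].corr(head_rows["relative_velocity"])

# Rank (Spearman) correlation, computed as Pearson correlation of the ranks.
rank_r = head_rows["average_diameter"].rank().corr(
    head_rows["relative_velocity"].rank()
)
print(pearson_r, rank_r)
```

On the full frame, the same two lines could be run once per hazardous group to check whether the relationship differs between them.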

Absolute magnitude determines whether or not an asteroid is hazardous

To find the relationship between an asteroid's absolute magnitude and whether it is hazardous, a box plot works best, since we have only one measurable parameter and a boolean. As the graph illustrates, there is a clear difference between the two groups, and a cutoff in magnitude that separates dangerous asteroids from harmless ones.

In [31]:
fig = px.box(earth_object, x="hazardous", y="absolute_magnitude", color="hazardous",
             title="Absolute magnitude for non-hazardous asteroids vs hazardous")
fig.show()
Out[31]:
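Absolute magnitude and diameter are closely linked. The rows in Out[2] are consistent with the standard conversion D = 1329 / sqrt(albedo) * 10^(-H/5) km, with an albedo of 0.25 for est_diameter_min and 0.05 for est_diameter_max. The sketch below checks this on three of those rows. The two albedo values are inferred from the sample rows rather than stated by the data source, and diameter_km is a name introduced here.

```python
import math

def diameter_km(absolute_magnitude: float, albedo: float) -> float:
    """Estimated diameter in km from absolute magnitude H and an assumed albedo."""
    return 1329.0 / math.sqrt(albedo) * 10 ** (-absolute_magnitude / 5)

# (absolute_magnitude, est_diameter_min, est_diameter_max) from Out[2].
rows = [
    (16.73, 1.198271, 2.679415),
    (20.00, 0.265800, 0.594347),
    (22.20, 0.096506, 0.215794),
]

for h, d_min, d_max in rows:
    # Assumed albedos: 0.25 gives the smaller estimate, 0.05 the larger one.
    print(h, diameter_km(h, 0.25), d_min, diameter_km(h, 0.05), d_max)
```

If the pattern holds for the whole dataset, the diameter columns and absolute_magnitude carry largely the same information, which is worth keeping in mind when comparing how well each one separates the two groups.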

In this other graph, we checked whether the miss distance had any effect on the classification of an asteroid, again using box plots. The graphs show little difference in miss distance between the hazardous and harmless asteroids that passed Earth. They even show that the harmless asteroids had a smaller miss distance than their hazardous counterparts.

In [32]:
fig = px.box(earth_object, x="hazardous", y="miss_distance", color="hazardous",
             title="Miss distance from earth for non-hazardous asteroids vs hazardous")
fig.show()
Out[32]:

Conclusion

After studying and visualizing the data in the graphs, we were able to conclude that properties such as relative velocity and absolute magnitude are the deciding factors of a hazardous asteroid, and that over time the frequency of such asteroids has risen. Our hypothesis fell short in terms of the diameter or size of an asteroid being one of those deciding factors, which we found to be false after analyzing the data.
